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Abstract 

We describe sequential and parallel algorithms that ap- 
proximately solve linear programs with no negative coeffi- 
cients (a. La. mixed packing and covering problems). 

For explicitly given problems, our fastest sequential al- 
gorithm returns a solution satisfying all constraints within 
alie factor in 0(md\og(m) / e 2 ) time, where m is the 
number of constraints and d is the maximum number of con- 
straints any variable appears in. 

Our parallel algorithm runs in time polylogarithmic in 
the input size times e -4 and uses a total number of opera- 
tions comparable to the sequential algorithm. 

The main contribution is that the algorithms solve mixed 
packing and covering problems ( in contrast to pure packing 
or pure covering problems, which have only "<" or only 
"> " inequalities, but not both) and run in time independent 
of the so-called width of the problem. 



1. Background 

Packing and covering problems are problems that can be 
formulated as linear programs using only non-negative co- 
efficients and non-negative variables. Special cases include 
pure packing problems, which are of the form max{c • x : 
Ax < b} and pure covering problems, which are of the form 
min{c • x : Ax > b}. 

Lagrangian-relaxation algorithms are based on the fol- 
lowing basic idea. Given an optimization problem speci- 
fied as a collection of constraints, modify the problem by 
selecting some of the constraints and replacing them by a 
continuous "penalty" function that, given a partial solution 
x, measures how close x is to violating the removed con- 
straints. Construct a solution iteratively in small steps, mak- 
ing each choice to maintain the remaining constraints while 
minimizing the increase in the penalty function. 

While Lagrangian-relaxation algorithms have the disad- 
vantage of producing only approximately optimal (or ap- 
proximately feasible) solutions, the algorithms have the fol- 



lowing potential advantages in comparison to the simplex, 
interior point, and ellipsoid methods. They can be faster, 
easier to implement, and easier to parallelize. They can be 
particularly useful for problems that are sparse, or that have 
exponentially many variables or constraints (but still have 
some polynomial-size representation). 

Lagrangian relaxation was one of the first methods 
proposed for solving linear programs — as early as the 
1950's, John von Neumann apparently proposed and ana- 
lyzed an 0(m 2 n log(mn)/e 2 )-time Lagrangian-relaxation 
algorithm for solving two-person zero-sum matrix games 
(equivalent to pure packing or covering) [Q. The algo- 
rithm returned a solution with additive error e assuming the 
matrix was scaled to lie between and 1. In 1950, Brown 
and von Neumann also proposed a system of differential 
equations that converged to an optimal solution, with the 
suggestion that the equations could form the basis of an al- 
gorithm [H]. 

Subsequent examples include a multicommodity flow al- 
gorithm by Ford and Fulkerson (1958), Dantzig-Wolfe de- 
composition (1960), Benders' decomposition (1962), and 
Held and Karp's lower bound for the traveling salesman 
problem (1971). In 1990, Shahrokhi and Matula proved 
polynomial-time convergence rates for a Lagrangian- 
relaxation algorithm for multicommodity flow. This caught 
the attention of the theoretical computer science research 
community, which has since produced a large body of re- 
search on the subject. Klein et al. [ p"5[ | and Leighton et 
al. [ 19 1 (and many others) gave additional multicommodity 
flow results. Plotkin, Shmoys, and Tardos [ ^l| ] and Grigori- 
adis and Khachiyan Jl0| , |ljL H ^ adapted the techniques to 
the general class of packing/covering problems, including 
mixed packing and covering problems. These algorithms' 
running times depended linearly on the width — an un- 
bounded function of the input instance. Relatively compli- 
cated techniques were developed to transform problems so 
as to reduce their width. 

From this body of work we adapt and use the follow- 
ing specific techniques: the technique of variable-size in- 
crements (Garg and Konemann fl7l [17p); a way of partition- 



ing the steps of the Garg/Konemann algorithm into phases 
(Fleischer and the idea of incrementing multiple vari- 
ables simultaneously (Luby and Nisan [p0|]). 

Variable-sized increments yield algorithms whose run- 
ning times are independent of the width of the problem 
instance, effectively replacing the width by the number of 
constraints. Partitioning into phases reduces the time to im- 
plement each step. Finally, incrementing multiple variables 
simultaneously allows fast parallel algorithms. 

Previously, as far as we know, these techniques have only 
been applied to pure packing or covering problems, not to 
mixed packing and covering problems. We know of no 
other width-independent or parallel algorithms for mixed 
packing and covering. Our contribution here is to present 
such algorithms. 

Although Luby and Nisan characterize their algorithm as 
solving "linear programs with non-negative coefficients", in 
fact it applies only to pure packing or pure covering prob- 
lems [pO]. 

After presenting and analyzing the algorithms, we con- 
clude with two illustrative examples. 

2. Mixed Packing and Covering 

We consider problems in the following form: 
Approximate Mixed Packing and Covering: Given 
non-negative matrices P,C, vectors p, c and e G (0,1), 
find an approximately feasible vector x > (s.t. Px < 
(1 + e)p and Cx > c) or a proof that no vector x is feasible 
(i.e., satisfies x > 0, Px < p and Cx > c). 

In Section || we describe how to reduce the optimization 
version min{A : Px < A p, Cx > c} to the above form. 

For notational simplicity, we assume throughout that 
each coordinate of p and c is some constant N, This is 
without loss of generality by the reduction Pj . = Py -N/pi 
and = CijN/ci (after removing any constraints of 
the form (Px); < — which only force to zero each 
Xj such that P^ > — or the form (Cx); > — 
which do not constrain x at all). A vector x is feasible if 
max Px < N < min Cx. 

All of the algorithms in this paper are specializations of 
the generic algorithm in Fig. |[ The algorithm starts with an 
infeasible vector x and adds to x in small increments until x 
becomes approximately feasible, that is, until min Cx > N 
and maxPx < (1 + 0(e))N. Instead of working with 
the max and min functions, the algorithm works to achieve 
a stronger condition: lminCx>A^ and ImaxPx < (1 + 
0(e))N, where Imax and Imin are "smooth" functions that 
approximate max and min: 

Definition 1 For real values y = (yi, y2, . . . , y- m ), define 
Imax y = ln£ j; e yi and Imin y = — ln^e~ yi . 



Afc>fe max y < Imax y anc/ min y > Iminy. 

Recall that, for any continuous function /(x), increas- 
ing Xj by 5 increases / by approximately 6 times the partial 
derivative of / with respect to Xj . In lines 2 and 3 of the 
algorithm, partial^ (P, x) and partial^ (C, — x) are, respec- 
tively, the partial derivatives of Imax(Px) and Imin(Cx) 
with respect to Xj. Thus, the condition in line 7b says that 
a variable Xj may be increased only if doing so increases 
Imax(Px) by at most 1 + 0(e) times as much as it in- 
creases Imin(Cx). We say "1 + 0(e)" instead of 1 + e 
because the partial derivatives only approximate the actual 
increases, and in fact the condition on line 7c is necessary 
to ensure that the change in x is small enough so that the 
partial derivatives do give good approximations. 

Because each step increases Imax(Px) by at most 1 + 
O(e) times as much as it increases Imin(Cx), the ratio of 
the two quantities tends to 1 + 0(e) (or less). Thus, the 
algorithm drives x to approximate feasibility. 

Line 5 ensures that there is a j meeting the condition of 
line 7b. Why is line 5 okay? Briefly, because the gradients 
of Imax and Imin have 1-norm equal to 1 (that is, the sum 
of their partial derivatives equals 1), one can show that, for 
any x and any feasible x*, the dot product of x* with the 
gradient of Imax(Px) is at most N, while the dot product of 
x* with the gradient of Imin(Cx) is at least N. Since x* > 
0, this means at least one partial derivative of Imin (Cx) is as 
large as the corresponding partial derivative of Imax(Px). 

In the remainder of this section we give the complete 
analysis of the performance guarantee. 

Let partially) = e yi / ' £; e yi > so mat partial i (y) is 
the partial derivative of lmax(y) with respect to y; and 
partial^ ( — y) is the partial derivative of lmin(y) with re- 
spect to y;. We will use the following "chain rule": for 
any M, x, a, 

Ei(Ma)< partial^(Mx) = £\ a, partial^M, x) (1) 
We start with a utility lemma. 

Lemma 1 (smoothness of Imin and Imax) For all y,f3 > 
0, ifO < fa < e < 1 then 

lmax(y + (3) < lmax(y) + (1 + e) £ 4 [3, partially) 

and 

lmin(y + (3) > lmin(y) + (1 - e/2) £V ft partia^(-y). 

Proof: Using the standard sorts of inequalities that underlie 
Chernoff bounds, namely ln(l + z) < z (for all z) and 

e 13 - 1 < (1 + e)j3 (for < < e < 1): 

lmax(y + 0) — lmax(y) 

< Ei(e^ - l)e yi /]T\e y< 

< (l + «0£i ft partially). 
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in: P, C, e, arbitrary x 

out: 'infeasible' or x s.t. Px < (1 + 0(e))N, Cx > N. 

1 . Let N <— (max Px + 2 In m)/e, where m is the number of constraints. 



2. Define partiaL,(M, x) = £i Mjje^/ Ei e< Mx )\ - 

3. Define ratio,,(x) = partial^P, x)/ partial^C, — x). 

4. While min Cx < iV do: 

5. If mux, ratioj(x) > 1 then return 'infeasible'. 

6. Delete the ith row of C for each i s.t. (Cx); > N. 
7a. Choose "increment" vector a. > such that 

7b. (Vj) atj > only if ratio.,- (x) < 1 + e 

7c. and maxjmax Ca, max Pa} = e. 

8. Letx^x + a. 

9. Return x. 



-partial derivative o/lmax and Imin 



— see Lemma [3| 
— constraint deletion for efficiency 

d Imax /cbc,- < (1 + e)<9 Imin /cbc,- 

— step size 
— do the increment 



Figure 1. Generic algorithm. Implementable in 0(mlog(m)/e 2 ) linear-time iterations. 



This proves the first inequality in the statement of the 
lemma. The second inequality follows by an analogous 



chain of inequalities, using 1 

< < e < 1). 



-0 



> (1 - e/2)/3 (for 
□ 



Because of this smoothness, if the increment is small 
enough (i.e. the"step-size" condition on line 7c of the algo- 
rithm is met) the partial derivatives approximate the changes 
in Imin and Imax well — within a 1 ± e factor: 

Lemma 2 In each increment, the increase of Imax Px is at 



most 



l-e/2 



times that of Imin Cx. 



Proof: When the generic algorithm increments x by a, the 
vector a meets the following conditions: 

1. (Vj) OLj > -> ratioj (x) < 1 + e; 

2. max{maxCa, maxPa} < e. 

Adding a to x adds Pa to Px. From Condition 2 above 
and Lemma [jl it follows that ImaxPx increases by at most 
(1 + e ) E 1 ( Pq 0* partial -(Px). By the chain rule this 
equals (1 + e) £\ atj partial^P, x). 

Similarly, IminCx increases by at least 
(1 — e/2) J2j a j partial 3 -(C, — x). Since aj > only if 

(Condition 1 

□ 



partial., (P,x) < (1 + e) partial^ (C, -x) 
above), Lemma || follows. 



Next we show that if the problem instance is feasible 
there always exists a choice of a meeting the conditions 
on lines 7b and 7c of the algorithm. This is necessary for 
the algorithm to be well-defined. 

Lemma 3 If the problem instance is feasible, then Vx3j : 
ratioj(x) < 1. 



Proof: Let x be arbitrary and let x* be a feasible solution. 
By the chain rule ([[]), 

£ 3 x* partial, (P,x) = ^(Px*); partial^Px). 

Since (Px*) 4 < N, and partial^Px) = 1, the quantity 
above is at most N. 

Likewise, £\ x* partial^C, -x) > N. 

Since x* > 0, there must be some j such that 
partial -(C, -x) > partial -(P,x). □ 

Lemma ^] means that the condition on line 7b can be met, 
and clearly by scaling the condition on line 7c can also be 
met. 

From Lemma |[ the basic performance guarantee follows 
easily. 

Lemma 4 If the problem instance is feasible, the generic 
algorithm returns an approximately feasible solution. 
Given an initial x, the algorithm makes 0(m(maxPx + 
logm)/e 2 ) increments. 

Proof: First we prove the performance guarantee. Define 



<1> = ImaxPx — 



(i+er 



Imin Cx. 



l-e/2 

Before the first increment $ < ln(m e maxPx ) + (1 + 
0(e)) mm < O(Ne). By Lemma ^, no increment oper- 
ation increases <t>. Deleting a covering constraint increases 
Imin Cx and therefore only decreases <£>. Thus, $ < O(eN) 
throughout the course of the algorithm and, just before the 
last increment (when Imin Cx < min Cx < N), 

ImaxPx < 0(eN) + (l + 0(e)) IminCx 
< (l + 0(e))N. 

With the last increment, max Px increases by at most e, so 
at termination, maxPx < (1 + 0(e))N while minCx > 
N. 
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in: P, C, e, arbitrary x 

out: 'infeasible' or x s.t. Px < (1 + 0(e))N, Cx > N. 

1. Let N <— (max Px + 2 In m)/e, where m is the number of constraints. 

2. Define localj(x) = J2i P^e*^ 1 / J2i Cije~^ Cx ^. — terms o/ ratio,,- (x) that depend on j 

3. Define global(x) = e (Px) V Ei e" (Cx)l . —so ratio^x) = local,- (x)/ global(x) 

4. While min Cx < iV do: 

5a. If g is not yet set or min., local,-(x)/g > 1 + e then 

5b. let g <— global (x), and — start new phase 

5c. if mm,- local,-(x)/g > 1 then return 'infeasible'. 

6. Delete the ith row of C for each i s.t. (Cx); > N. 
7a. Choose "increment" vector a > such that 

7b. (Vj) ctj > only if local,(x)/g < 1 + e — stronger cond'n than in generic alg. 

7c. and maxjmax Ca, max Pa} = e. 

8. Let x <— x + a. 

9. Return x. 

Figure 2. Algorithm with phases. Implementable in 0(mdlog(m)/e 2 ) operations, where d is the maxi- 
mum number of constraints any variable appears in. 



Next we bound the number of increments. Let m c and 
m p be the number of rows of C and P, respectively, so that 
m c +m p = m. Define * = E,:( Px )*+E 4 (( Cx )*-^- e )- 
It is initially at least —m c (N + e), and finally at most 
m p (N + e). By the "step-size" condition in line 7c, each in- 
crement increases by at least e. Because of the constraint- 
deletion operations in line 6, (Cx)i < N before each in- 
crement and so (Cx); < N + e after each increment (for 
each row i remaining in C). Thus, each constraint dele- 
tion increases ^. Thus, the number of increments is at most 
m(N+e) j e, which gives the desired bound by the definition 
of N. □ 

To specify a particular implementation of the algorithm, 
we need to specify how the initial x is chosen and how 
the increment a is chosen in each iteration. Here is one 
straightforward implementation: initialize x to 0, and with 
each increment choose a to be a vector where a.j = 
for all j except for a single j such that partial^ (P, x) < 
(1 + e) partial^ (C, — x). The value of that aj is determined 
by the step-size condition. This still leaves some flexi- 
bility. For example, one can choose j so as to minimize 
partial^ (P, x) — partial^ (C, — x) or to minimize (within 
1 + e) partial^ (P, x)/ partial^ (C, — x). Clearly, if the prob- 
lem instance is given explicitly, then either of these choices 
can be implemented in time linear in the number of non- 
zero entries in the matrices. This gives the following corol- 
lary: 

Corollary 1 The generic algorithm can implemented to ap- 
proximately solve any explicitly given problem instance 
in 0(mlog(m)/e 2 ) linear-time iterations, where m is the 



number of constraints. 

3. Algorithm with Phases 

This algorithm specializes the generic algorithm. In 
order to speed the computation of the key function 
ratio^x), we break it into two components as ratio^x) = 
localj(x)/ global(x), where localj captures the terms that 
depend on j. As x changes, we recompute global(x) only 
occasionally — at the start of each phase (i.e., iteration of 
the outer loop). The algorithm is shown in Fig. ||. First 
we discuss how this is a particular implementation of the 
generic algorithm, and then we prove a stronger time bound. 

Lemma 5 The algorithm with phases is a specialization of 
the generic algorithm. 

Proof: We argue that any increment the algorithm does 
is also an allowable increment for the generic algorithm. 
Since local, (x)/global(x) = ratio^x) and localj(x) and 
global(x) only increase as the algorithm proceeds, at all 
times localj (x)/g > ratio^x) . Thus, any a meeting the 
conditions of the algorithm with phases also meets the con- 
ditions of the generic algorithm. q 

This and Lemma [| imply the performance guarantee: 

Corollary 2 The algorithm with phases returns an approx- 
imately feasible solution. Given an initial x, the algorithm 
makes at most 0{m (max Px + log m) /e 2 ) increments. 

Now here is the stronger time bound: 
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in: P,C,e 

out: 'infeasible' or x s.t. Px < 
0. Let x,- 
1. 
2. 
3. 
4. 



[l + 0(e))N, Cx > N. 



mirij l/(nPij) for each j, where n is the # of var's. 
Let N <— (max Px + 2 In m)/e, where m is the # of constraints. 



-x initialized, not given 



Define local, (x) = J2 t 
Define global(x) = J2 
While min Cx < N do: 



P e 

. p(Px), 



(Px). 



27 1 

-(Cx), 



-(Cx), 



5a. If <? is not yet set or mim. localj (x)/g > 1 + e then 
5b. let g <— global(x), and 

5c. if min,- localj(x)/g > 1 then return 'infeasible'. 

6. Delete the ith row of C for each % s.t. (Cx)< > N. 
7a. Choose "increment" vector ct > such that for some <5 > 
7b. (Vj) ctj = Xj/5 if localj-(x)/<7 < 1 + e, else e*j = 0, 

7c. and maxjmax Ca, max Pa} = e. 

8. Let x <— x + ck. 

9. Return x. 

Figure 3. Parallel algorithm. Implementable in parallel time polylogarithmic in input size times e 



- mcr. all allowed Xj 's, 
-prop, to current value 



Lemma 6 Given an initial x, f/ze algorithm with phases 
uses 0((max Px + log m) /e 2 ) phases. 

Proof: We claim that global(x) increases by at least a 1 + e 
factor each phase. By inspection, global(x) is initially at 
least l/m and finally at most me°( N ) /e~°( N \ so the result 
will follow. 

To see the claim, note that at the end of a phase, 
localj(x)/(7 > 1 + e for all j, i.e. g < min,- localj(x)/(l + 
e) . But, by Lemma ||, at the start of the next phase, the same 
x and the next g satisfy localj(x)/g = ratioj(x) < 1 for 
some j, i.e. g > minj loca^x). rj 

Consider the following "round-robin" implementation of 
the algorithm. Start with x = 0. Implement each phase by 
cycling through the indices j once. For each j, as long as 
localj(x)/g < 1 + e, repeatedly increment x by the vec- 
tor a that has all coordinates except atj, whose value is 
determined by the step size. 

Maintain the values (Px); and (Cx), for every i. After 
a variable Xj is incremented, the only values that change are 
those where or Cy are non-zero. So maintaining these 
values requires 0(d) time, where d < m is the maximum 
number constraints any variable appears in. 

With these values in hand, the condition localj(x)/<7 < 
1 + e can be checked in 0(d) time. Since the number of 
increments is 0(mlog(m)/e 2 ), the total time to do incre- 
ments is 0(md\og(m) / e 2 ). Other than increments, each of 
the 0(log(m)/e 2 ) phases requires 0(md) time, so we have 
the following corollary: 

Corollary 3 The algorithm with phases can be imple- 
mented to approximately solve any explicitly given problem 



instance using 0(md\og(m) /e 2 ) operations, where d is the 
maximum number of constraints any variable appears in. 

Note that this is an improvement on Corollary by a fac- 
tor equal to the number of non-zero entries in the matrix, 
divided by d (this factor can be as large as the number of 
variables). 

4. Parallel Algorithm 

Next we further specialize the algorithm to achieve an 
efficient parallel implementation. The algorithm is shown 
in Fig. H The idea is that we start with each variable having 
a small but positive value. Then, in each increment step, we 
increase all allowed variables (line 7b), each proportionally 
to its current value. This method allows us to give a poly- 
logarithmic bound on the number of iterations per phase. 

Lemma 7 The parallel algorithm is a specialization of the 
algorithm with phases, starting with max Px < 1. The par- 
allel algorithm makes 0(log(m)\og(n\og(m)/e)/e 2 ) in- 
crements per phase. 

Proof: The first claim is true by inspection and the initial 
choice of x. 

It remains to bound the number of increments per phase. 
First, we claim that in each increment 5 = Sl(N/e). This 
is simply because by the choice of a, for some i, (Cx)i/S 
or (Px)i/5 is at least e, but (Cx), and (Px), are O(N) 
throughout the algorithm. Thus, for each j, each increment 
that increases x 3 increases it by at least a 1 + fl(e/N) fac- 
tor. Since x 3 is initially min; 1 / nPy and finally at most 
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min* N/Pij, it follows that at most 0(N \og(Nn)/e) in- 
crements increase Xj . 

Finally, in each phase, the last increment of the phase in- 
creases some Xj . In fact each increment in the phase must 
have increased that Xj, since local., (x) only increased dur- 
ing the phase. Thus, the number of increments in the phase 
isO{N\og(Nn)/e). ' □ 

Since each increment can be implemented in parallel 
in polylogarithmic time, and the number of increments is 
bounded by the number of phases times the number of in- 
crements per phase. We have the following corollary. 

Corollary 4 The parallel algorithm can be implemented to 
approximately solve any explicitly given problem instance 
in parallel time polylogarithmic in the input size times 1 /e 4 . 

5. Reducing Optimization to Feasibility 

Given a problem instance P, p, C, c and e > 0, let A* = 
min{A : (3x) Px < Ap,Cx > c}. In this section we 
describe how to use the algorithms in this paper to approx- 
imately solve this optimization problem — that is, to com- 
pute a feasible solution (A, x) such that A* < A < (l+e)A*. 

We reduce the optimization problem to a sequence of 
approximate feasibility subproblems. Each subproblem re- 
quires e'-approximately solving 3?x : Px < A' p, Cx > c 
for a particular e' and A'. We can solve such a subproblem 
using any of the algorithms in this paper. 

Lemma 8 The approximate optimization problem reduces 
to a sequence of approximate feasibility subproblems: 
O(loglogm) subproblems with e' — 1/2 and 0(logl/e) 
subproblems where the ith-to-last subproblem has e' = 

mm 

Note that if we solve the subproblems using any algo- 
rithm whose time depends at least linearly on 1/e (such as 
the ones in this paper), then the total time to solve the sec- 
ond set of subproblems (those with e' < 1 /2) dominated by 
the time used to solve the last such subproblem. 

The remainder of this section contains the proof. The 
basic idea is to use binary search for A*, solving a feasibility 
problem at each step to bound A*. This lemma gives starting 
upper and lower bounds: 

Lemma 9 Let A = E; min j Ej'( p i'jM0/( c y/ c i)- 
Then A* < A < m 2 A*. 

Proof: Recall A* = min{A : (3x) Px < Ap, Cx > c}. 
For each i = 1, . . . , m c , consider the following relaxation: 

A* = min{A : (3x) Ei'( Px )i'M' < ™p A > ( Cx )* > c i}- 

That is, only one specified covering constraint, and the 
sum of the packing constraints, need to hold. Clearly 



A* < A*. Furthermore the optimal solution z(i) to the 
ith relaxed problem is given by finding j that minimizes 
(Pi? /Pi') I '(Cy / 'cj) and setting all coordinates of z(i) 
to zero except Zj (i) = l/(Cjj/cj). Correspondingly A* = 

To get an m 2 -approximate solution to the original prob- 
lem, take x = Ei z W an d ^ = Hi m pK- The pair 
(x, A) is a feasible solution to the original problem (each 
covering constraint is met the contribution of the corre- 
sponding z(i), while each packing constraint is met because 
(Px)i</pi< < A). Finally, A < m 2 A* because each 
A*<A*. ' ' n 

Take A = A/to 2 (for A as in the lemma) so 1 < 
A*/A < to 2 . Next use binary search to find an integer 
j such that 2 J < A*/A < 2 J+1 . Given an arbitrary i, to 
decide whether i < j, solve the feasibility subproblem tak- 
ing A' = A 2* and e' = 1/2. If there is an approximate 
solution then A* < (1 + e')A' < A 2 m and hence i < j. 
Otherwise the problem is infeasible so A* > A' = A 2* and 
hence i > j. Since there are O(logTO) possible values of 
i, the binary search takes O(loglogTO-) subproblems each 
with e' = 1/2. 

We have now computed Ai = A 2 : ' such that 1 < 
A*/Ai < 2. Next we increase the precision. We start the ith 
step with Ai such that A*/A^ e [1,1 + 5*] for some Si > 0, 
solve the feasibility problem taking A' = A^(l + Si/ 4) 
and e' = Si/ 4. If there is an approximate solution then 
A*/Ai G [1, (1 + Si/4) 2 }, so take X i+1 = A^. Otherwise the 
problem is infeasible, implying A*/Ai £ [1 + Si/4, 1 + Si], 
so take Aj+i = A*(l + Si/ 4). In either case a calculation 
shows that A*/A i+ i £ [1,1 + S i+1 ] for S i+1 = (3/4)5*. 
Before O (log 1/e) steps, 5* < e, at which point the most 
recent solution produced will be e-optimal. 

The second phase (increasing the precision) requires 
solving 0(log 1/e) subproblems, but, because e' decreases 
geometrically in each step, the time to solve the subprob- 
lems is dominated by the time to solve the final subprob- 
lem (with S = 0(e)). Thus, the entire computation time 
is O (log log m) times the time to solve a feasibility prob- 
lem with e' = 1/2 plus the time to solve a single feasibility 
problem with e' = e. 

6. Examples 

6.1. Min-Cost Concurrent Multicommodity Flow 

This section illustrates how to handle problems with ex- 
ponentially many variables. 

An instance of the min-cost concurrent multicommodity 
flow problem is defined by a weighted, capacitated, directed 
graph G, a collection of commodities C\, C2, Ck, and a 
demand d* > for each commodity. Each commodity C* is 
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in: weighted, capacitated digraph G, commodities, {(sj, U, di)}, budget W. 

out: 'infeasible' or / s.t. /(e) < fi e (l + 0(e)), f(s h U) >d h wf<(l + 0(e))W 

0. Initialize f(j>) <— for all p. 

1. Let N <— 2 ln(m)/e, where m = 1 + hedges + ^commodities. 

2. Define local p (/) = [w(p)e w ' / / w /W + Y,ee P z m/ile I Ve\/[e~ Ksuti)/di /h]. 

3. Define global(/) = \e w '^ w + E e e /(e)Ate ]/[E* e^'*^*]. 

4. Until each commodity's demand is exceeded by a factor of N do: 
5a. If g is not yet set or min p local p (/)/g > 1 + e, then 

5b. let g <— global(/), and 

5c. if mhip \oca\ p (f)/g > 1 then return 'infeasible'. 

6. Delete any commodity whose demand is exceeded by a factor of N. 

7. Choose any commodity i and path p for it s.t. local p (/)/g < 1 + e. 

8. Set f(p) <— f(p) + 5, where 5 = eminjdj, VF/w(p), min eep /i e }. 

9. Return f /N. 

Figure 4. Algorithm with phases, applied to min-cost concurrent multicommodity f low. Implementable 
in time bounded by time to solve 0(mlog(m)/e 2 ) shortest-path subproblems, where m is the number 
of edges plus the number of commodities. Note: /(s;, U) denotes the flow shipped for commodity i. 



the set of paths from some source vertex Si to a sink vertex 
ti. We also assume we are given a budget W > 0. 

A solution is a multicommodity flow /, consisting of a 
network flow fi for each commodity Cj. We think of / as 
specifying a flow fi(p) > for each path p G Ci, but it 
also induces a flow fi(e) — Epae fi(p) on each edge e. 
Without loss of generality, no path is in two commodities, 
so we drop the subscript i from fi(p). 

Let w e and /i e denote the weight and capacity of edge 
e, respectively. Define the weight of path p to be w p — 

J2eep We > an( ^ th e we ig nt °f fl° w / to be w ■ / = 
J2 p w p f(p). For the total flow on an edge e or path p, 
we use /(e) = ^2 p3e f{p)- The amount of commodity i 
shipped is E P ec, AP)- 

A solution / is feasible if: the amount of commodity i 
shipped is at least dj, the flow on each edge is within the 
capacity (/(e) < /i e ), and the weight of the flow is within 
the budget (w ■ f < W). An approximate solution, given 
e > 0, is one where \fi\ > (1 — e)di, /(e) < (1 + e)/x e , and 
w-f<(l + e)W. 

As described, the problem is naturally a mixed pack- 
ing/covering problem with a variable f(p) for each path p 
and with the following constraints: (Vi) E p ec /(p) — 
(Ve)E p3e /(P)<Me,Ep/(PK<^ 

The simple implementation of the algorithm with phases 
reduces in this case to the algorithm in Fig. ^. As presented 
in the figure, the algorithm uses exponentially many vari- 
ables (one for each path). However, to implement the al- 
gorithm it suffices to maintain only the flow for each edge 
and commodity and the total cost. To implement the in- 
ner loop, do the following for each commodity i: repeat- 



edly find the shortest path p from Si to U in the graph with 
edge weights given by i{e) = w e e w '^ w + e^ 1 ^^' / p, e . 
If the length of p is at most (1 + e)g e~^ SiM)/di / di, then 
local p (/)/g < (1 + e), so augment flow onp as described 
in the figure, otherwise, move on to the next commodity. 

The time the algorithm takes is bounded by the short- 
est path computations. The number of these is equal to 
the number of increments plus at most one per commodity 
per phase. The number of increments is 0(m log(m)/e 2 ), 
while there are 0(log(m)/e 2 ) phases and 0(m) commodi- 
ties. Thus there are 0(m log(ra)/e 2 ) shortest path compu- 
tations. 

6.2. X-Ray Tomography / Linear Equations 

Computer tomography (a.k.a. x-ray tomography or 
Radon transform) is a special case of mixed packing and 
covering. Briefly, x-rays are taken of an object from many 
directions, and the internal structure (density at each point) 
of the object is reconstructed from the results. 

For illustration, consider the following simple case. As- 
sume an object resides within annxiiXB cube. Discretize 
the cube by partitioning it into n 3 1 x 1 x 1 subcubes in the 
obvious way. Enumerate the subcubes in some order and 
introduce a variable Xj representing the density of the jth 
subcube. 

Take d x-ray snapshots of the object from different direc- 
tions. With current techniques, n is typically a few hundred 
and d = Q(n 2 ) so that enough information is gathered to 
reconstruct a single solution. Even two-dimensional recon- 
struction problems are useful, as a volume can be recon- 
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in: A, e 

out: 'infeasible' or x s.t. 1 < Ax < 1 + 0(e). 

0. Let Xj = mirij l/(nAy), where n is the number of variables. 

1 . Let N <— (1 + 2 In m)/e, where m is the number of constraints. 

2. Define local^x) = £i A^e^/Ei A^e-^'. 

3. Define global(x) = e (Ax) V Ei e~ (Ax)l . 

4. While min Ax < iV do: 

5a. If g is not yet set or mm, \oca\j(x)/g > 1 + e, then 
5b. let g <— global(x), and 

5c. if minj \oca\j(x)/g > 1 then return 'infeasible'. 

6. Delete the zth row of A for each % s.t. (Ax); > N. — unnecessary, as Ax < O(N) 
7a. Compute a and then 5 such that 

7b. olj = Xj if localj(x)/p < 1 + e and ctj = otherwise, 

7b. and 5 = maXj(Aa)j. 

8. Setx <— x + ea/5. 

9. Return x/JV. 

Figure 5. Parallel algorithm as it specializes to approximately solve Ax = b (in normalized form 
Ax = 1). Used in the x-ray tomography example. Runs in 0(log(m)/e 2 ) phases, makes 0(mlog(m)/e 2 ) 
increments, and runs in time 1/e 4 times polylogarithmic in n and m. 



structed in slices. 

Assume each x-ray produces annxn image. Discretize 
the image into its n 2 squares in the obvious way. Enumerate 
all dn 2 squares of all snapshots in some order. For the ith 
square, compute from the darkness of the square the total 
mass fii of the matter that the x-rays aimed at that square 
passed through. Add a constraint of the form ^ . Xj Ajj = 
1 where A, j is the volume of the intersection of cube j and 
the cylinder of x-rays aimed at square i, divided by /Xj. (If 
Hi = 0, delete all variables Xj such that the Ajj > 0.) 

The reconstruction problem is to find x > such that 
Ax = 1. The approximate version is to find x such that 
Ax > 1 and Ax < 1 + 0(e). The problem has 0(dn 2 ) 
constraints, and each variable occurs in d constraints. 

Variables for cubes known to be outside the object can 
be deleted. If appropriate, additional constraints such as 
Xj < 1 (to constrain the maximum density) can be added. 

Without these additional constraints, the problem is a 
special case of approximately solving a system of equations 
with non-negative coefficients: given A, finding x > such 
that Ax = 1. The parallel algorithm, as it specializes for 
this problem, is shown in Fig. || Note that deletion of satis- 
fied covering constraints can be omitted and the analysis of 
the algorithm will still hold, because the packing constraints 
ensure that no covering constraint exceeds its upper bound 
by more than an 0(1) factor. 

The total work done by this algorithm is more than 
with traditional methods for x-ray tomography (filtered 
back-projection and Fourier reconstruction). However, this 



method may be easier to parallelize. It is more flexible, in 
that additional constraints can be added. In some cases, for 
example when directions from which the snapshots can be 
taken are constrained, traditional methods suffer from ill- 
conditioning, whereas this approach may not. 

7. Final Remarks 

Open problem: find an efficient width-independent 
Lagrangian-relaxation algorithm for the abstract mixed- 
packing covering problem: 

Find x : Px < (1 + e)p, Cx > (1 - e)c, xeP 

where V is a polytope that can be queried by an optimiza- 
tion oracle (given c, return x G V minimizing c • x) or 
some other suitable oracle. The min-cost multicommod- 
ity flow example earlier in the paper is a special case. Al- 
though that example illustrates how to deal with exponen- 
tially many variables, the polytope in that example is the 
degenerate one {x : x > 0}. A polytope that illustrates the 
difficulty of the general case is V = {x : Ej x j = !}• The 
difficulty seems to be using variable-size increments with 
three constraints: the packing constraints, the covering con- 
straints, and the constraint of staying in the polytope. 

Find a parallel algorithm whose number of iterations is 
polylogarithmic in the number of constraints, even if the 
number of variables is exponential. Find a parallel algo- 
rithm whose running time has an e~ 2 or e~ 3 term instead of 
the e~ 4 term. 
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The algorithms in this paper handle any pure packing or 
covering problem as a special case. In this case the algo- 
rithms simplify somewhat, so that they can handle the opti- 
mization versions of the problems directly. 

The algorithms can be viewed as derandomizations (us- 
ing the method of conditional probabilities) of natural ran- 
domized rounding schemes (see | ^3j [24ft for this approach). 
Lower bounds on the number of iterations required by 
Lagrangian-relaxation algorithms are presented in Jig]. 

Thanks to Lisa Fleischer for useful discussions. 
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